DenseBox: Unifying Landmark Localization with End to End Object Detection

Authors

  • Lichao Huang
  • Yi Yang
  • Yafeng Deng
  • Yinan Yu
Abstract

How well can a single fully convolutional neural network (FCN) perform on object detection? We introduce DenseBox, a unified end-to-end FCN framework that directly predicts bounding boxes and object class confidences across all locations and scales of an image. Our contribution is two-fold. First, we show that a single FCN, if designed and optimized carefully, can detect multiple different objects extremely accurately and efficiently. Second, we show that when landmark localization is incorporated through multi-task learning, DenseBox further improves object detection accuracy. We present experimental results on public benchmark datasets, including MALF face detection and KITTI car detection, which indicate that DenseBox is the state-of-the-art system for detecting challenging objects such as faces and cars.
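As a rough illustration of what "predicting bounding boxes and confidences at all locations" can look like, the sketch below decodes a dense output map into boxes. The encoding used here (one confidence plus four pixel distances to the box edges at each output location), the `stride` value, and the function name `decode_dense_map` are assumptions made for illustration only; the abstract does not specify them.

```python
from typing import List, Sequence, Tuple

# (x1, y1, x2, y2, score) in input-image pixel coordinates
Box = Tuple[float, float, float, float, float]
# Assumed per-location encoding: distances (left, top, right, bottom) to the box edges
Offsets = Tuple[float, float, float, float]


def decode_dense_map(
    scores: Sequence[Sequence[float]],
    offsets: Sequence[Sequence[Offsets]],
    stride: int = 4,          # assumed ratio between input and output resolution
    threshold: float = 0.5,   # assumed confidence cut-off
) -> List[Box]:
    """Turn per-location confidences and edge distances into candidate boxes."""
    boxes: List[Box] = []
    for row, (score_row, offset_row) in enumerate(zip(scores, offsets)):
        for col, (score, dists) in enumerate(zip(score_row, offset_row)):
            if score < threshold:
                continue  # skip locations the network considers background
            left, top, right, bottom = dists
            # Map the output-grid location back to input-image coordinates
            cx, cy = col * stride, row * stride
            boxes.append((cx - left, cy - top, cx + right, cy + bottom, score))
    return boxes


if __name__ == "__main__":
    demo_scores = [[0.1, 0.9], [0.2, 0.3]]
    demo_offsets = [[(2, 2, 2, 2), (2, 2, 2, 2)], [(2, 2, 2, 2), (2, 2, 2, 2)]]
    print(decode_dense_map(demo_scores, demo_offsets))
```

In practice the candidate boxes from such a decoder would usually be filtered further, for example with non-maximum suppression, which this sketch leaves out.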


Similar resources

Visual EKF-SLAM from Heterogeneous Landmarks

Many applications require the localization of a moving object, e.g., a robot, using sensory data acquired from embedded devices. Simultaneous localization and mapping from vision performs both the spatial and temporal fusion of these data on a map when a camera moves in an unknown environment. Such a SLAM process executes two interleaved functions: the front-end detects and tracks features from...


Regressing Heatmaps for Multiple Landmark Localization Using CNNs

We explore the applicability of deep convolutional neural networks (CNNs) for multiple landmark localization in medical image data. Exploiting the idea of regressing heatmaps for individual landmark locations, we investigate several fully convolutional 2D and 3D CNN architectures by training them in an end-to-end manner. We further propose a novel SpatialConfiguration-Net architecture that effe...


Deep Deformation Network for Object Landmark Localization

We propose a novel cascaded framework, namely deep deformation network (DDN), for localizing landmarks in non-rigid objects. The hallmarks of DDN are its incorporation of geometric constraints within a convolutional neural network (CNN) framework, ease and efficiency of training, as well as generality of application. A novel shape basis network (SBN) forms the first stage of the cascade, whereb...


Joint Voxel and Coordinate Regression for Accurate 3D Facial Landmark Localization

3D face shape is more expressive and viewpoint-consistent than its 2D counterpart. However, 3D facial landmark localization in a single image is challenging due to the ambiguous nature of landmarks under 3D perspective. Existing approaches typically adopt a suboptimal two-step strategy, performing 2D landmark localization followed by depth estimation. In this paper, we propose the Joint Voxel an...


Query-Adaptive R-CNN for Open-Vocabulary Object Detection and Retrieval

We address the problem of open-vocabulary object retrieval and localization, which is to retrieve and localize objects from a very large-scale image database immediately by a textual query (e.g., a word or phrase). We first propose Query-Adaptive R-CNN, a simple yet strong framework for open-vocabulary object detection. Query-Adaptive RCNN is a simple extension of Faster R-CNN from closedvocabu...



Journal title:
  • CoRR

Volume abs/1509.04874  Issue 

Pages  -

Publication date 2015